Overview

Dataset statistics

Number of variables: 9
Number of observations: 3192
Missing cells: 7224
Missing cells (%): 25.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 224.6 KiB
Average record size in memory: 72.0 B
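
The missing-cell figures above can be cross-checked against the per-variable "Missing" counts reported in the Variables section. The following is a minimal sketch of that arithmetic; the dictionary and variable names are illustrative and not part of the report.

```python
# Figures copied from this report; the per-column counts come from the
# "Missing" entry of each variable in the Variables section.
n_variables = 9
n_observations = 3192

missing_per_column = {
    "dt": 0,
    "LandAverageTemperature": 12,
    "LandAverageTemperatureUncertainty": 12,
    "LandMaxTemperature": 1200,
    "LandMaxTemperatureUncertainty": 1200,
    "LandMinTemperature": 1200,
    "LandMinTemperatureUncertainty": 1200,
    "LandAndOceanAverageTemperature": 1200,
    "LandAndOceanAverageTemperatureUncertainty": 1200,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

print(total_cells)             # 28728
print(missing_cells)           # 7224
print(f"{missing_pct:.1f}%")   # 25.1%
```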

Variable types

Categorical: 1
Numeric: 8

Alerts

dt has a high cardinality: 3192 distinct values (High cardinality)
LandAverageTemperature is highly correlated with LandAverageTemperatureUncertainty and 3 other fields (High correlation)
LandAverageTemperatureUncertainty is highly correlated with LandAverageTemperature and 3 other fields (High correlation)
LandMaxTemperature is highly correlated with LandAverageTemperature and 2 other fields (High correlation)
LandMaxTemperatureUncertainty is highly correlated with LandAverageTemperatureUncertainty and 2 other fields (High correlation)
LandMinTemperature is highly correlated with LandAverageTemperature and 2 other fields (High correlation)
LandMinTemperatureUncertainty is highly correlated with LandAverageTemperatureUncertainty and 2 other fields (High correlation)
LandAndOceanAverageTemperature is highly correlated with LandAverageTemperature and 2 other fields (High correlation)
LandAndOceanAverageTemperatureUncertainty is highly correlated with LandAverageTemperatureUncertainty and 2 other fields (High correlation)
LandMaxTemperature has 1200 (37.6%) missing values (Missing)
LandMaxTemperatureUncertainty has 1200 (37.6%) missing values (Missing)
LandMinTemperature has 1200 (37.6%) missing values (Missing)
LandMinTemperatureUncertainty has 1200 (37.6%) missing values (Missing)
LandAndOceanAverageTemperature has 1200 (37.6%) missing values (Missing)
LandAndOceanAverageTemperatureUncertainty has 1200 (37.6%) missing values (Missing)
dt is uniformly distributed (Uniform)
dt has unique values (Unique)

Reproduction

Analysis started: 2022-09-18 18:20:02.974350
Analysis finished: 2022-09-18 18:20:32.150837
Duration: 29.18 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json

Variables

dt
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 3192
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 25.1 KiB

Value | Count
1750-01-01 | 1
1926-09-01 | 1
1926-11-01 | 1
1926-12-01 | 1
1927-01-01 | 1
Other values (3187) | 3187

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 31920
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
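
The character totals reported for dt are consistent with a column of ISO-formatted month-start dates. The sketch below assumes (this is an assumption, suggested by the sample rows and the count of 3192 unique values) that dt contains the first day of every month from 1750-01 through 2015-12, and counts characters using only the standard library.

```python
from collections import Counter
from datetime import date

# Assumption: dt is the first day of each month from 1750-01 to 2015-12.
dates = [
    date(year, month, 1).isoformat()
    for year in range(1750, 2016)
    for month in range(1, 13)
]

# Count every character across all date strings.
char_counts = Counter("".join(dates))

print(len(dates))                   # 3192 values
print(len(set(dates)))              # 3192 distinct values
print(sum(char_counts.values()))    # 31920 characters in total
print(len(char_counts))             # 11 distinct characters (ten digits and "-")
print(char_counts.most_common(3))   # [('1', 8158), ('0', 6728), ('-', 6384)]
```

Parsing dt as a date rather than as a string would also avoid the high-cardinality alert, since the column would no longer be treated as categorical.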

Unique

Unique: 3192
Unique (%): 100.0%

Sample

1st row: 1750-01-01
2nd row: 1750-02-01
3rd row: 1750-03-01
4th row: 1750-04-01
5th row: 1750-05-01

Common Values

Value | Count | Frequency (%)
1750-01-01 | 1 | < 0.1%
1926-09-01 | 1 | < 0.1%
1926-11-01 | 1 | < 0.1%
1926-12-01 | 1 | < 0.1%
1927-01-01 | 1 | < 0.1%
1927-02-01 | 1 | < 0.1%
1927-03-01 | 1 | < 0.1%
1927-04-01 | 1 | < 0.1%
1927-05-01 | 1 | < 0.1%
1927-06-01 | 1 | < 0.1%
Other values (3182) | 3182 | 99.7%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
1750-01-01 | 1 | < 0.1%
1752-01-01 | 1 | < 0.1%
1751-12-01 | 1 | < 0.1%
1750-03-01 | 1 | < 0.1%
1750-04-01 | 1 | < 0.1%
1750-05-01 | 1 | < 0.1%
1750-06-01 | 1 | < 0.1%
1750-07-01 | 1 | < 0.1%
1750-08-01 | 1 | < 0.1%
1750-09-01 | 1 | < 0.1%
Other values (3182) | 3182 | 99.7%

Most occurring characters

Value | Count | Frequency (%)
1 | 8158 | 25.6%
0 | 6728 | 21.1%
- | 6384 | 20.0%
9 | 2138 | 6.7%
8 | 2138 | 6.7%
7 | 1538 | 4.8%
2 | 1288 | 4.0%
5 | 950 | 3.0%
6 | 938 | 2.9%
3 | 830 | 2.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 25536 | 80.0%
Dash Punctuation | 6384 | 20.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 8158 | 31.9%
0 | 6728 | 26.3%
9 | 2138 | 8.4%
8 | 2138 | 8.4%
7 | 1538 | 6.0%
2 | 1288 | 5.0%
5 | 950 | 3.7%
6 | 938 | 3.7%
3 | 830 | 3.3%
4 | 830 | 3.3%

Dash Punctuation
Value | Count | Frequency (%)
- | 6384 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 31920 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 8158 | 25.6%
0 | 6728 | 21.1%
- | 6384 | 20.0%
9 | 2138 | 6.7%
8 | 2138 | 6.7%
7 | 1538 | 4.8%
2 | 1288 | 4.0%
5 | 950 | 3.0%
6 | 938 | 2.9%
3 | 830 | 2.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 31920 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 8158 | 25.6%
0 | 6728 | 21.1%
- | 6384 | 20.0%
9 | 2138 | 6.7%
8 | 2138 | 6.7%
7 | 1538 | 4.8%
2 | 1288 | 4.0%
5 | 950 | 3.0%
6 | 938 | 2.9%
3 | 830 | 2.6%

LandAverageTemperature
Real number (ℝ)

HIGH CORRELATION

Distinct: 2839
Distinct (%): 89.3%
Missing: 12
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.374731132
Minimum: -2.08
Maximum: 19.021
Zeros: 0
Zeros (%): 0.0%
Negative: 17
Negative (%): 0.5%
Memory size: 25.1 KiB

Quantile statistics

Minimum: -2.08
5-th percentile: 1.97075
Q1: 4.312
median: 8.6105
Q3: 12.54825
95-th percentile: 14.395
Maximum: 19.021
Range: 21.101
Interquartile range (IQR): 8.23625

Descriptive statistics

Standard deviation: 4.381309771
Coefficient of variation (CV): 0.5231582604
Kurtosis: -1.342072459
Mean: 8.374731132
Median Absolute Deviation (MAD): 4.1565
Skewness: -0.08142566548
Sum: 26631.645
Variance: 19.19587531
Monotonicity: Not monotonic
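
Several of the statistics reported for LandAverageTemperature are derived from others: the range from the extremes, the IQR from the quartiles, the coefficient of variation from the standard deviation and mean, and the variance from the standard deviation. A minimal sketch of those relationships, using values copied from this report (the variable names are illustrative):

```python
import math

# Values copied from the LandAverageTemperature statistics in this report.
minimum, maximum = -2.08, 19.021
q1, q3 = 4.312, 12.54825
mean = 8.374731132
std = 4.381309771

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

print(round(value_range, 3))  # 21.101
print(round(iqr, 5))          # 8.23625
print(round(cv, 6))           # 0.523158
print(round(variance, 4))     # 19.1959
```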
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
13.765 | 4 | 0.1%
13.293 | 4 | 0.1%
2.039 | 3 | 0.1%
11.097 | 3 | 0.1%
14.242 | 3 | 0.1%
12.247 | 3 | 0.1%
2.737 | 3 | 0.1%
14.742 | 3 | 0.1%
3.099 | 3 | 0.1%
3.92 | 3 | 0.1%
Other values (2829) | 3148 | 98.6%
(Missing) | 12 | 0.4%

Value | Count | Frequency (%)
-2.08 | 1 | < 0.1%
-1.503 | 1 | < 0.1%
-1.431 | 1 | < 0.1%
-1.385 | 1 | < 0.1%
-1.249 | 1 | < 0.1%
-0.978 | 1 | < 0.1%
-0.837 | 1 | < 0.1%
-0.811 | 1 | < 0.1%
-0.806 | 1 | < 0.1%
-0.793 | 1 | < 0.1%

Value | Count | Frequency (%)
19.021 | 1 | < 0.1%
17.91 | 1 | < 0.1%
17.61 | 1 | < 0.1%
17.115 | 1 | < 0.1%
16.821 | 1 | < 0.1%
16.521 | 1 | < 0.1%
16.468 | 1 | < 0.1%
16.391 | 1 | < 0.1%
16.183 | 1 | < 0.1%
16.025 | 1 | < 0.1%

LandAverageTemperatureUncertainty
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1594
Distinct (%): 50.1%
Missing: 12
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9384679245
Minimum: 0.034
Maximum: 7.88
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.1 KiB

Quantile statistics

Minimum: 0.034
5-th percentile: 0.066
Q1: 0.18675
median: 0.392
Q3: 1.41925
95-th percentile: 3.2351
Maximum: 7.88
Range: 7.846
Interquartile range (IQR): 1.2325

Descriptive statistics

Standard deviation: 1.096439795
Coefficient of variation (CV): 1.168329536
Kurtosis: 3.536050467
Mean: 0.9384679245
Median Absolute Deviation (MAD): 0.31
Skewness: 1.780596521
Sum: 2984.328
Variance: 1.202180224
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0.087 | 20 | 0.6%
0.064 | 19 | 0.6%
0.077 | 16 | 0.5%
0.078 | 16 | 0.5%
0.068 | 14 | 0.4%
0.086 | 14 | 0.4%
0.082 | 14 | 0.4%
0.085 | 13 | 0.4%
0.084 | 13 | 0.4%
0.07 | 12 | 0.4%
Other values (1584) | 3029 | 94.9%

Value | Count | Frequency (%)
0.034 | 1 | < 0.1%
0.035 | 2 | 0.1%
0.036 | 1 | < 0.1%
0.039 | 1 | < 0.1%
0.04 | 1 | < 0.1%
0.041 | 3 | 0.1%
0.042 | 2 | 0.1%
0.044 | 2 | 0.1%
0.045 | 3 | 0.1%
0.046 | 3 | 0.1%

Value | Count | Frequency (%)
7.88 | 1 | < 0.1%
7.492 | 1 | < 0.1%
7.349 | 1 | < 0.1%
6.415 | 1 | < 0.1%
6.341 | 1 | < 0.1%
5.966 | 1 | < 0.1%
5.963 | 1 | < 0.1%
5.857 | 1 | < 0.1%
5.827 | 1 | < 0.1%
5.656 | 1 | < 0.1%

LandMaxTemperature
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 1814
Distinct (%): 91.1%
Missing: 1200
Missing (%): 37.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.3506009
Minimum: 5.9
Maximum: 21.32
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.1 KiB

Quantile statistics

Minimum: 5.9
5-th percentile: 8.118
Q1: 10.212
median: 14.76
Q3: 18.4515
95-th percentile: 20.1625
Maximum: 21.32
Range: 15.42
Interquartile range (IQR): 8.2395

Descriptive statistics

Standard deviation: 4.309578966
Coefficient of variation (CV): 0.3003065164
Kurtosis: -1.456171165
Mean: 14.3506009
Median Absolute Deviation (MAD): 4.137
Skewness: -0.09693800875
Sum: 28586.397
Variance: 18.57247086
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
10.781 | 3 | 0.1%
8.555 | 3 | 0.1%
19.753 | 3 | 0.1%
11.411 | 3 | 0.1%
17.713 | 3 | 0.1%
20.037 | 3 | 0.1%
17.196 | 3 | 0.1%
19.36 | 3 | 0.1%
19.85 | 3 | 0.1%
19.987 | 3 | 0.1%
Other values (1804) | 1962 | 61.5%
(Missing) | 1200 | 37.6%

Value | Count | Frequency (%)
5.9 | 1 | < 0.1%
6.421 | 1 | < 0.1%
6.436 | 1 | < 0.1%
6.642 | 1 | < 0.1%
6.679 | 1 | < 0.1%
6.686 | 1 | < 0.1%
6.864 | 1 | < 0.1%
6.961 | 1 | < 0.1%
7.023 | 1 | < 0.1%
7.064 | 1 | < 0.1%

Value | Count | Frequency (%)
21.32 | 1 | < 0.1%
21.199 | 1 | < 0.1%
21.108 | 1 | < 0.1%
21.085 | 1 | < 0.1%
21.006 | 2 | 0.1%
20.972 | 1 | < 0.1%
20.97 | 1 | < 0.1%
20.923 | 1 | < 0.1%
20.922 | 1 | < 0.1%
20.905 | 1 | < 0.1%

LandMaxTemperatureUncertainty
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 841
Distinct (%): 42.2%
Missing: 1200
Missing (%): 37.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4797816265
Minimum: 0.044
Maximum: 4.373
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.1 KiB

Quantile statistics

Minimum: 0.044
5-th percentile: 0.083
Q1: 0.142
median: 0.252
Q3: 0.539
95-th percentile: 1.86245
Maximum: 4.373
Range: 4.329
Interquartile range (IQR): 0.397

Descriptive statistics

Standard deviation: 0.5832029575
Coefficient of variation (CV): 1.215559174
Kurtosis: 7.55348287
Mean: 0.4797816265
Median Absolute Deviation (MAD): 0.136
Skewness: 2.565863891
Sum: 955.725
Variance: 0.3401256896
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0.093 | 14 | 0.4%
0.105 | 12 | 0.4%
0.098 | 11 | 0.3%
0.094 | 11 | 0.3%
0.106 | 11 | 0.3%
0.13 | 11 | 0.3%
0.16 | 10 | 0.3%
0.099 | 10 | 0.3%
0.179 | 10 | 0.3%
0.09 | 9 | 0.3%
Other values (831) | 1883 | 59.0%
(Missing) | 1200 | 37.6%

Value | Count | Frequency (%)
0.044 | 1 | < 0.1%
0.048 | 1 | < 0.1%
0.052 | 1 | < 0.1%
0.055 | 3 | 0.1%
0.056 | 1 | < 0.1%
0.057 | 2 | 0.1%
0.058 | 2 | 0.1%
0.059 | 2 | 0.1%
0.06 | 2 | 0.1%
0.061 | 1 | < 0.1%

Value | Count | Frequency (%)
4.373 | 1 | < 0.1%
4.24 | 1 | < 0.1%
4.164 | 1 | < 0.1%
3.751 | 1 | < 0.1%
3.491 | 1 | < 0.1%
3.36 | 1 | < 0.1%
3.339 | 1 | < 0.1%
3.188 | 1 | < 0.1%
3.187 | 1 | < 0.1%
3.184 | 1 | < 0.1%

LandMinTemperature
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 1873
Distinct (%): 94.0%
Missing: 1200
Missing (%): 37.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.743595382
Minimum: -5.407
Maximum: 9.715
Zeros: 0
Zeros (%): 0.0%
Negative: 684
Negative (%): 21.4%
Memory size: 25.1 KiB

Quantile statistics

Minimum: -5.407
5-th percentile: -3.32445
Q1: -1.3345
median: 2.9495
Q3: 6.77875
95-th percentile: 8.51045
Maximum: 9.715
Range: 15.122
Interquartile range (IQR): 8.11325

Descriptive statistics

Standard deviation: 4.15583532
Coefficient of variation (CV): 1.514740602
Kurtosis: -1.433529954
Mean: 2.743595382
Median Absolute Deviation (MAD): 4.088
Skewness: -0.05025501431
Sum: 5465.242
Variance: 17.27096721
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
8.161 | 3 | 0.1%
7.818 | 3 | 0.1%
-1.139 | 3 | 0.1%
7.892 | 3 | 0.1%
8.184 | 3 | 0.1%
3.025 | 2 | 0.1%
-2.52 | 2 | 0.1%
-1.162 | 2 | 0.1%
-1.43 | 2 | 0.1%
0.629 | 2 | 0.1%
Other values (1863) | 1967 | 61.6%
(Missing) | 1200 | 37.6%

Value | Count | Frequency (%)
-5.407 | 1 | < 0.1%
-5.345 | 1 | < 0.1%
-4.947 | 1 | < 0.1%
-4.717 | 1 | < 0.1%
-4.678 | 1 | < 0.1%
-4.621 | 1 | < 0.1%
-4.558 | 1 | < 0.1%
-4.519 | 1 | < 0.1%
-4.465 | 1 | < 0.1%
-4.365 | 1 | < 0.1%

Value | Count | Frequency (%)
9.715 | 1 | < 0.1%
9.684 | 1 | < 0.1%
9.569 | 1 | < 0.1%
9.551 | 1 | < 0.1%
9.482 | 1 | < 0.1%
9.456 | 1 | < 0.1%
9.428 | 1 | < 0.1%
9.409 | 1 | < 0.1%
9.407 | 1 | < 0.1%
9.344 | 1 | < 0.1%

LandMinTemperatureUncertainty
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 781
Distinct (%): 39.2%
Missing: 1200
Missing (%): 37.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4318488956
Minimum: 0.045
Maximum: 3.498
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.1 KiB

Quantile statistics

Minimum: 0.045
5-th percentile: 0.08455
Q1: 0.155
median: 0.279
Q3: 0.45825
95-th percentile: 1.3948
Maximum: 3.498
Range: 3.453
Interquartile range (IQR): 0.30325

Descriptive statistics

Standard deviation: 0.4458378371
Coefficient of variation (CV): 1.032393139
Kurtosis: 7.0548683
Mean: 0.4318488956
Median Absolute Deviation (MAD): 0.135
Skewness: 2.384389692
Sum: 860.243
Variance: 0.198771377
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0.237 | 12 | 0.4%
0.082 | 11 | 0.3%
0.145 | 11 | 0.3%
0.13 | 11 | 0.3%
0.126 | 11 | 0.3%
0.338 | 10 | 0.3%
0.224 | 10 | 0.3%
0.213 | 10 | 0.3%
0.127 | 9 | 0.3%
0.125 | 9 | 0.3%
Other values (771) | 1888 | 59.1%
(Missing) | 1200 | 37.6%

Value | Count | Frequency (%)
0.045 | 1 | < 0.1%
0.047 | 1 | < 0.1%
0.051 | 3 | 0.1%
0.053 | 1 | < 0.1%
0.054 | 2 | 0.1%
0.055 | 1 | < 0.1%
0.058 | 1 | < 0.1%
0.06 | 3 | 0.1%
0.061 | 2 | 0.1%
0.062 | 2 | 0.1%

Value | Count | Frequency (%)
3.498 | 1 | < 0.1%
3.428 | 1 | < 0.1%
2.963 | 1 | < 0.1%
2.929 | 1 | < 0.1%
2.843 | 1 | < 0.1%
2.822 | 1 | < 0.1%
2.795 | 1 | < 0.1%
2.714 | 1 | < 0.1%
2.594 | 1 | < 0.1%
2.56 | 1 | < 0.1%

LandAndOceanAverageTemperature
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 1596
Distinct (%): 80.1%
Missing: 1200
Missing (%): 37.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.21256576
Minimum: 12.475
Maximum: 17.611
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.1 KiB

Quantile statistics

Minimum: 12.475
5-th percentile: 13.3
Q1: 14.047
median: 15.251
Q3: 16.39625
95-th percentile: 17.0166
Maximum: 17.611
Range: 5.136
Interquartile range (IQR): 2.34925

Descriptive statistics

Standard deviation: 1.274092954
Coefficient of variation (CV): 0.08375266699
Kurtosis: -1.322464368
Mean: 15.21256576
Median Absolute Deviation (MAD): 1.179
Skewness: -0.05604937795
Sum: 30303.431
Variance: 1.623312857
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
15.005 | 5 | 0.2%
16.596 | 4 | 0.1%
13.26 | 4 | 0.1%
13.311 | 4 | 0.1%
15.927 | 4 | 0.1%
16.496 | 4 | 0.1%
16.846 | 4 | 0.1%
16.783 | 4 | 0.1%
13.704 | 3 | 0.1%
13.54 | 3 | 0.1%
Other values (1586) | 1953 | 61.2%
(Missing) | 1200 | 37.6%

Value | Count | Frequency (%)
12.475 | 1 | < 0.1%
12.62 | 1 | < 0.1%
12.658 | 1 | < 0.1%
12.702 | 1 | < 0.1%
12.732 | 1 | < 0.1%
12.828 | 1 | < 0.1%
12.833 | 1 | < 0.1%
12.839 | 1 | < 0.1%
12.84 | 1 | < 0.1%
12.879 | 1 | < 0.1%

Value | Count | Frequency (%)
17.611 | 1 | < 0.1%
17.609 | 1 | < 0.1%
17.607 | 1 | < 0.1%
17.589 | 1 | < 0.1%
17.578 | 1 | < 0.1%
17.568 | 1 | < 0.1%
17.532 | 1 | < 0.1%
17.508 | 1 | < 0.1%
17.503 | 2 | 0.1%
17.491 | 1 | < 0.1%

LandAndOceanAverageTemperatureUncertainty
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 294
Distinct (%): 14.8%
Missing: 1200
Missing (%): 37.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1285321285
Minimum: 0.042
Maximum: 0.457
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.1 KiB

Quantile statistics

Minimum: 0.042
5-th percentile: 0.052
Q1: 0.063
median: 0.122
Q3: 0.151
95-th percentile: 0.28345
Maximum: 0.457
Range: 0.415
Interquartile range (IQR): 0.088

Descriptive statistics

Standard deviation: 0.07358679601
Coefficient of variation (CV): 0.5725167462
Kurtosis: 1.525069706
Mean: 0.1285321285
Median Absolute Deviation (MAD): 0.0535
Skewness: 1.275594309
Sum: 256.036
Variance: 0.005415016546
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0.061 | 49 | 1.5%
0.059 | 47 | 1.5%
0.06 | 45 | 1.4%
0.062 | 44 | 1.4%
0.057 | 41 | 1.3%
0.058 | 39 | 1.2%
0.056 | 37 | 1.2%
0.054 | 35 | 1.1%
0.063 | 35 | 1.1%
0.064 | 31 | 1.0%
Other values (284) | 1589 | 49.8%
(Missing) | 1200 | 37.6%

Value | Count | Frequency (%)
0.042 | 1 | < 0.1%
0.043 | 1 | < 0.1%
0.045 | 3 | 0.1%
0.046 | 3 | 0.1%
0.047 | 4 | 0.1%
0.048 | 12 | 0.4%
0.049 | 13 | 0.4%
0.05 | 25 | 0.8%
0.051 | 22 | 0.7%
0.052 | 27 | 0.8%

Value | Count | Frequency (%)
0.457 | 1 | < 0.1%
0.442 | 1 | < 0.1%
0.438 | 1 | < 0.1%
0.427 | 1 | < 0.1%
0.417 | 1 | < 0.1%
0.414 | 1 | < 0.1%
0.402 | 1 | < 0.1%
0.389 | 1 | < 0.1%
0.387 | 1 | < 0.1%
0.378 | 2 | 0.1%

Interactions


Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
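
The calculation described above can be sketched in plain Python: rank both variables (averaging ranks for ties), then apply the covariance-over-standard-deviations formula to the ranks. The helper names `average_ranks`, `pearson` and `spearman` are illustrative, not part of any library, and this is not necessarily how the report's software computes the coefficient.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x

print(spearman(x, y))  # approximately 1: the relationship is perfectly monotonic
print(pearson(x, y))   # below 1: the relationship is not linear
```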

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
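
As a minimal sketch of this formula and of the invariance property mentioned above, the following plain-Python function computes r and shows that shifting and rescaling each variable (by positive factors) leaves it unchanged. The function name `pearson_r` and the sample values are illustrative only.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Illustrative values, not taken from the dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)

# Changing location and scale of each variable separately does not change r.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

print(r)
print(r_rescaled)
```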

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
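
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows; the function name `kendall_tau_a` and the sample values are illustrative. Tied pairs count as neither concordant nor discordant here, whereas library implementations often use the tie-corrected tau-b variant, so results can differ when ties are present.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Illustrative values, not taken from the dataset.
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 5, 2]

print(kendall_tau_a(x, y))  # 0.2 (6 concordant, 4 discordant, 10 pairs)
```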

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display which lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
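
Nullity correlation, as described above, can be sketched as the Pearson correlation between the is-missing indicators of two columns. The function name `nullity_correlation` and the toy columns below are hypothetical and do not come from the dataset; this is an illustration of the idea, not the report's implementation.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns.

    Returns None when either column is entirely present or entirely missing,
    because the correlation is undefined for a constant indicator.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# Hypothetical toy columns; None marks a missing value.
col_max = [None, None, 14.2, 15.1, 13.8, 14.9]
col_min = [None, None, 2.1, 3.0, 1.7, 2.8]
col_avg = [8.1, None, 8.4, 9.0, 7.9, 8.8]
col_full = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

print(nullity_correlation(col_max, col_min))   # close to 1: missing in the same rows
print(nullity_correlation(col_max, col_avg))   # between 0 and 1: partial overlap
print(nullity_correlation(col_max, col_full))  # None: col_full has no missing values
```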

Sample

First rows

index | dt | LandAverageTemperature | LandAverageTemperatureUncertainty | LandMaxTemperature | LandMaxTemperatureUncertainty | LandMinTemperature | LandMinTemperatureUncertainty | LandAndOceanAverageTemperature | LandAndOceanAverageTemperatureUncertainty
0 | 1750-01-01 | 3.034 | 3.574 | NaN | NaN | NaN | NaN | NaN | NaN
1 | 1750-02-01 | 3.083 | 3.702 | NaN | NaN | NaN | NaN | NaN | NaN
2 | 1750-03-01 | 5.626 | 3.076 | NaN | NaN | NaN | NaN | NaN | NaN
3 | 1750-04-01 | 8.490 | 2.451 | NaN | NaN | NaN | NaN | NaN | NaN
4 | 1750-05-01 | 11.573 | 2.072 | NaN | NaN | NaN | NaN | NaN | NaN
5 | 1750-06-01 | 12.937 | 1.724 | NaN | NaN | NaN | NaN | NaN | NaN
6 | 1750-07-01 | 15.868 | 1.911 | NaN | NaN | NaN | NaN | NaN | NaN
7 | 1750-08-01 | 14.750 | 2.231 | NaN | NaN | NaN | NaN | NaN | NaN
8 | 1750-09-01 | 11.413 | 2.637 | NaN | NaN | NaN | NaN | NaN | NaN
9 | 1750-10-01 | 6.367 | 2.668 | NaN | NaN | NaN | NaN | NaN | NaN

Last rows

index | dt | LandAverageTemperature | LandAverageTemperatureUncertainty | LandMaxTemperature | LandMaxTemperatureUncertainty | LandMinTemperature | LandMinTemperatureUncertainty | LandAndOceanAverageTemperature | LandAndOceanAverageTemperatureUncertainty
3182 | 2015-03-01 | 6.740 | 0.060 | 12.659 | 0.096 | 0.894 | 0.079 | 15.193 | 0.061
3183 | 2015-04-01 | 9.313 | 0.088 | 15.224 | 0.137 | 3.402 | 0.147 | 15.962 | 0.061
3184 | 2015-05-01 | 12.312 | 0.081 | 18.181 | 0.117 | 6.313 | 0.153 | 16.774 | 0.058
3185 | 2015-06-01 | 14.505 | 0.068 | 20.364 | 0.133 | 8.627 | 0.168 | 17.390 | 0.057
3186 | 2015-07-01 | 15.051 | 0.086 | 20.904 | 0.109 | 9.326 | 0.225 | 17.611 | 0.058
3187 | 2015-08-01 | 14.755 | 0.072 | 20.699 | 0.110 | 9.005 | 0.170 | 17.589 | 0.057
3188 | 2015-09-01 | 12.999 | 0.079 | 18.845 | 0.088 | 7.199 | 0.229 | 17.049 | 0.058
3189 | 2015-10-01 | 10.801 | 0.102 | 16.450 | 0.059 | 5.232 | 0.115 | 16.290 | 0.062
3190 | 2015-11-01 | 7.433 | 0.119 | 12.892 | 0.093 | 2.157 | 0.106 | 15.252 | 0.063
3191 | 2015-12-01 | 5.518 | 0.100 | 10.725 | 0.154 | 0.287 | 0.099 | 14.774 | 0.062